AIS: A Fast, Disk Space Efficient "Adaptable Installation System" Supporting Multitudes of Diverse Software Configurations
Authors: Mikhailov and Stanton
Abstract
Efficiently installing and configuring large sets of computer systems is an important concern for system and cluster administrators. Current solutions usually follow one of the two approaches: an image-based install or a metadata-based custom install. Both approaches limit the opportunities for optimizing the installation time by coupling the system specification with the installation technique and ignoring the relationships between configurations over time (as they evolve with patches and new packages). The Adaptable Installation System (AIS) is a new model and implementation that attempts to address these shortcomings by taking a hybrid approach to client system installation. As in the metadata-based approach, it uses descriptors to express what the final system should look like in terms of composition and configuration. At the same time, it uses imaging for part of the client reinstallation to achieve speed. In this paper we present the design and implementation of AIS along with details on the algorithm that builds images and performance results of running the prototype system on a set of RedHat based machines.

Introduction and Motivation

Efficiently installing and configuring large sets of computer systems is an important concern for system and cluster administrators. Numerous programs that facilitate automated and unattended installations have been created. Generally they follow one of the two approaches: an image-based install or a metadata-based custom install.

| # | Speed of Installation | Number of Configurations | Available Server Storage | Optimal Installation Approach |
|---|---|---|---|---|
| 1 | Not a factor | Small | Not a factor | Custom, Image |
| 2 | Not a factor | Small | Factor | Custom |
| 3 | Not a factor | Large | Not a factor | Custom |
| 4 | Not a factor | Large | Factor | Custom |
| 5 | Factor | Small | Not a factor | Image |
| 6 | Factor | Small | Factor | AIS |
| 7 | Factor | Large | Not a factor | AIS |
| 8 | Factor | Large | Factor | AIS |

Table 1: Factors that affect installation scenarios. The three middle columns are the installation factors.
Both techniques are widely used and effective in certain scenarios. However, both have two main disadvantages:

1. They couple the specification of the system (golden server, Kickstart or other configuration file) with the installation technique (over-network disk-image writing, package or OS install tool). This limits opportunities for algorithmic optimization and prevents using the best specification with the best technique in all cases.
2. They both assume a static model of system installation, meaning they assume installation is a singular, fairly heavy-weight event. Although some adaptation to more dynamic environments such as Cluster-On-Demand [18] is possible, they do not capture the relationship between system configurations over time.

Table 1 compares the applicability of the two approaches with respect to three typical considerations in large installation/cluster management. The right-most column indicates the preferred installation approach given the values of the three considerations shown in the columns to its left. When the speed of client re-installation is not critical, custom install is the better option because of its ability to scale to a large number of system configurations and its modest storage requirements. A new system configuration can be supported by creating a corresponding metadata descriptor, e.g., a Kickstart file. Imaging remains the preferred approach when the speed of client re-installation is critical. However, as the number of configurations increases dramatically, imaging becomes time consuming to manage and requires large amounts of storage on the server.

Figure 1: Stages in installation process.

2004 LISA XVIII – November 14-19, 2004 – Atlanta, GA. Mikhailov and Stanton.
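Table 1 can also be read as a simple lookup from the three installation factors to the preferred approach. The sketch below encodes the eight rows as given in the table; the function name `preferred_approach` and the boolean encoding of each factor (True for "Factor" or "Large") are assumptions made for illustration and are not part of AIS.

```python
# Table 1 encoded as a lookup. Key order (an illustrative encoding):
#   (speed is a factor, number of configurations is large, storage is a factor)
TABLE_1 = {
    (False, False, False): ("Custom", "Image"),  # row 1
    (False, False, True): ("Custom",),           # row 2
    (False, True, False): ("Custom",),           # row 3
    (False, True, True): ("Custom",),            # row 4
    (True, False, False): ("Image",),            # row 5
    (True, False, True): ("AIS",),               # row 6
    (True, True, False): ("AIS",),               # row 7
    (True, True, True): ("AIS",),                # row 8
}


def preferred_approach(speed_is_factor, many_configurations, storage_is_factor):
    """Return the approach(es) Table 1 lists for this combination of factors."""
    key = (bool(speed_is_factor), bool(many_configurations), bool(storage_is_factor))
    return TABLE_1[key]


if __name__ == "__main__":
    # Fast re-installation needed, many configurations, limited server storage.
    print(preferred_approach(True, True, True))
```

Reading the table this way makes the pattern visible: AIS is the listed approach whenever speed matters and either the number of configurations is large or server storage is constrained.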
Both imaging and custom installation approaches fall short of simultaneously supporting fast client re-installations, a large number of system configurations, and moderate storage usage. The Adaptable Installation System (AIS) is a new model and implementation that attempts to address these shortcomings by taking a hybrid approach to client system installation. As in the metadata-based approach, it uses descriptors to express what the final system should look like in terms of composition and configuration. At the same time, it uses imaging for part of the client re-installation to achieve speed.

The need to support a large number of system configurations while simultaneously allowing fast client re-installations is motivated by the development of "utility" computing and "clustering on demand" services such as the Oceano [4, 10] and Cluster-On-Demand [18] projects. These projects' business model is that an organization manages computer clusters on behalf of its clients (whether internal or external) and guarantees an agreed-upon level of performance. It maintains a large pool of client machines and dynamically reallocates them from one virtual cluster to another as necessary.

A second motivation is the continued need to patch or upgrade, and occasionally downgrade, production servers. This cycle causes a continually increasing set of configurations that differ only by a few package versions. Maintaining them as online images would require continually growing storage and management complexity.

| Term | Definition |
|---|---|
| System Configuration | A description of the composition of computer system software in terms of the packages installed on the system. In the case of an RPM-based distribution of Linux, a list of all RPMs installed on the system. |
| System Configuration Descriptor (SCD) | An XML file that describes system configuration at a class level. Configuration is described in terms of an Installation Base and the additional packages that make up the system. |
| Installation Base | A minimal, working system configuration. Typically consists of a kernel, C libraries, and a small set of common Unix programs. |
| Host Configuration | System Configuration, plus any modifications to operating system configuration files needed to achieve the desired host state. Host configuration can be achieved manually, by copying host-specific files from the AIS server, or by running tools such as Cfengine [6]. |
| Host Configuration Descriptor (HCD) | An XML file that specifies host configuration details. It references the SCD on which this host configuration is based. |
| AIS Server | The machine executing the AIS server code that calculates the contents of each image and builds the cached image files. This machine also hosts the image cache files and package files required for client installation. |

Table 2: Definitions of system components.

[Figure 2: Sample system configuration descriptor. Recoverable entries: fc1; mozilla-1.4.1-17; other package ...]

[Figure 3: Sample host configuration file. Recoverable entries: Configuration 1; /etc/hosts; /other/file ...]

[Figure 4: Adding a new system configuration descriptor requires the image cache to be updated.]

AIS Operation

To better understand how AIS operates, it is helpful to know the phases involved in client installation. They are shown in Figure 1. The terms used in this section are defined in Table 2.

As seen in Figure 1, AIS operates on two fronts: the installation server and the client machine being installed. On the AIS Server, AIS maintains a repository of System and Host Configuration Descriptors and a repository of installation packages. With these, AIS can reconstruct on disk any system configuration from its descriptors. This is shown as Phase 1 of the overall process in Figure 1. Note that AIS does not store or recover application data files except for host configuration contained in the HCD.
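As a rough illustration of how the two descriptor types in Table 2 relate, the sketch below models them as small Python records and expands an SCD into a full package list (installation base plus additional packages). It is not the AIS XML format: the class and field names, the helper `full_package_list`, the host name, and the contents of the base are all hypothetical, and treating `fc1` as the name of an installation base is an assumption drawn from the surviving Figure 2 entries.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SystemConfigurationDescriptor:
    """Class-level description: an installation base plus additional packages."""
    name: str
    installation_base: str
    additional_packages: List[str] = field(default_factory=list)


@dataclass
class HostConfigurationDescriptor:
    """Host-level description: the SCD it is based on plus host-specific files."""
    hostname: str
    scd: SystemConfigurationDescriptor
    host_files: List[str] = field(default_factory=list)


def full_package_list(scd: SystemConfigurationDescriptor,
                      base_contents: Dict[str, List[str]]) -> List[str]:
    """Expand an SCD: base packages first, then additional ones, without duplicates."""
    ordered: List[str] = []
    for package in base_contents[scd.installation_base] + scd.additional_packages:
        if package not in ordered:
            ordered.append(package)
    return ordered


if __name__ == "__main__":
    # Placeholder package names for a hypothetical "fc1" installation base.
    bases = {"fc1": ["kernel", "glibc", "bash"]}
    scd = SystemConfigurationDescriptor("Configuration 1", "fc1", ["mozilla-1.4.1-17"])
    hcd = HostConfigurationDescriptor("node01", scd, ["/etc/hosts"])
    print(full_package_list(hcd.scd, bases))
    print(hcd.host_files)
```

Keeping the HCD as a thin layer that only references an SCD mirrors the definitions in Table 2: many hosts can share one system configuration and differ only in their host-specific files.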
The data files can be maintained on network file servers or separate data disks. One such method is described in [17].

Furthermore, AIS creates and maintains a repository of images (Phase 2), which it uses as the first step in re-installing a client machine. This step is depicted as Phase 3 of the overall process.

The key to AIS operation is that it does not necessarily store images for all system configurations. Furthermore, an image does not have to correspond to any specific configuration. Instead, the content of the image cache is determined by an algorithm, and the same image can be used to initiate the re-installation of more than one system configuration. The algorithm aims to minimize the client installation time. In its decision making it considers all system configurations that might need to be installed and the disk space available for storing image files. Since the image does not necessarily correspond to an exact system configuration, AIS performs the necessary post-imaging steps to arrive at the exact system configuration. This step is depicted as Phase 4 in Figure 1.

On the client, AIS is a script that runs as soon as the machine boots. The script uses a third-party imaging tool, Frisbee, to retrieve the image from the AIS Server and write it to disk. After this, the script performs additional package installations, as necessary, to achieve the desired system configuration. Finally, it performs any host-specific configuration.

The following sections provide additional details about AIS operations on the Installation Server and the client machines.

AIS Operations on the AIS Server
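The idea that one cached image can serve several configurations, with post-imaging package installs closing the gap, can be sketched as follows. This is only an illustration and not the AIS algorithm, which also decides what the cache should contain under a disk-space limit. The function `pick_image` is hypothetical, and it rests on two assumptions: an image is usable only if every package it contains belongs to the target configuration (so the post-imaging step only adds packages), and the number of packages left to install stands in for installation time.

```python
from typing import Dict, Optional, Set, Tuple


def pick_image(target: Set[str],
               image_cache: Dict[str, Set[str]]) -> Tuple[Optional[str], Set[str]]:
    """Choose the cached image that leaves the fewest packages to install.

    Returns (image name or None, packages to install after imaging).
    """
    best_name: Optional[str] = None
    best_missing: Set[str] = set(target)  # no usable image: install every package
    for name in sorted(image_cache):      # sorted for a deterministic tie-break
        contents = image_cache[name]
        if not contents <= target:
            continue  # image holds packages the target does not want
        missing = target - contents
        if len(missing) < len(best_missing):
            best_name, best_missing = name, missing
    return best_name, best_missing


if __name__ == "__main__":
    # Placeholder image names and package names.
    cache = {
        "base": {"kernel", "glibc"},
        "web": {"kernel", "glibc", "httpd"},
        "desktop": {"kernel", "glibc", "mozilla"},
    }
    image, remaining = pick_image({"kernel", "glibc", "httpd", "php"}, cache)
    print(image, sorted(remaining))
```

In this example the "web" image would be written to disk first, and only the remaining package would be installed afterwards, which corresponds to Phases 3 and 4 of the process described above.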
Publication date: 2004